ABSTRACT

Heterochromatin protein 1a (HP1a) is a chromatin- 
associated protein important for the formation and 
maintenance of heterochromatin. In Drosophila, 
the two histone methyltransferases SETDB1 and 
Su(var)3-9 mediate H3K9 methylation marks that
initiate the establishment and spreading of
HP1a-enriched chromatin. Although HP1a is gener- 
ally regarded as a factor that represses gene tran- 
scription, several reports have linked HP1a binding 
to active genes, and in some cases, it has been 
shown to stimulate transcriptional activity. To 
clarify the function of HP1a in transcription regula- 
tion and its association with Su(var)3-9, SETDB1 
and the chromosome 4-specific protein POF, we 
conducted genome-wide expression studies and 
combined the results with available binding data in 
Drosophila melanogaster. The results suggest that 
HP1a, SETDB1 and Su(var)3-9 repress genes on 
chromosome 4, where non-ubiquitously expressed 
genes are preferentially targeted, and stimulate 
genes in pericentromeric regions. Further, we 
showed that on chromosome 4, Su(var)3-9, 
SETDB1 and HP1a target the same genes. In 
addition, we found that transposons are repressed 
by HP1a and Su(var)3-9 and that the binding level 
and expression effects of HP1a are affected by 
gene length. Our results indicate that genes have 
adapted to be properly expressed in their local 
chromatin environment. 

INTRODUCTION 

The eukaryotic genome is organized into a DNA- and pro- 
tein-containing structure known as chromatin. Two major 
types of chromatin have been defined: euchromatin, which 
contains actively expressed genes and remains de- 
condensed throughout the cell cycle, and heterochromatin, 
which remains condensed and is generally associated with 
repression and inactive genes (1,2). Recently, a more 
specific classification of chromatin was introduced (3), in 
which the 'classical' gene-poor, repeat-rich and recombi- 
nationally silent heterochromatin (4) best corresponds to 
'GREEN chromatin'. GREEN chromatin is enriched in 
di- and trimethylation of lysine 9 on histone 3 (H3K9me2 
and H3K9me3) and the transcription-repressing 
heterochromatin protein 1a (HP1a), which binds to
H3K9me through its chromodomain (5). In Drosophila 
melanogaster, the H3K9me2 and H3K9me3 methylation 
marks are mainly controlled by at least two different 
histone methyltransferases (HKMTs), i.e. SETDB1 
(6-10) and Su(var)3-9 (10-12). The general idea of hetero-
chromatin formation is that two H3K9me-bound HP1a
molecules (13-17) interact through their chromo shadow
domain, forming a dimer that links two adjacent nucleo-
some molecules together (18,19) and resulting in methyla-
tion of the neighboring nucleosome through the
interaction of HKMTs with the chromo shadow domain
of the bound HP1a (11). This initiates a spreading mech-
anism that causes the chromatin to become condensed and
inactive (5,13). Recently, it has been proposed that HP1a
initially binds strongly to the promoters of active genes
independently of H3K9me, forming a nucleation site for
further H3K9me-dependent spreading of HP1a along the
gene body (10). GREEN chromatin, and consequently
the HP1a binding sites, in the D. melanogaster genome
are primarily located in the pericentromeric regions and 
on the 4th chromosome (3). The pericentromeric regions 
are heterochromatic regions adjacent to the centromeres, 
and H3K9me2 and me3 in these regions are mainly 
mediated by Su(var)3-9 (10,11). 

The 4th chromosome is associated with several hetero-
chromatic markers (including HP1a) and a high content
of repetitive elements and transposable elements, yet
the chromosome contains multiple active genes that are
interspersed between the repetitive elements, giving
it a gene density and expression output similar to that
of euchromatic sites, as reviewed by Riddle and colleagues
(20,21).

Both SETDB1 and Su(var)3-9 are associated with the 
4th chromosome, but it is primarily SETDB1 that is 
responsible for mediating H3K9 methylation in this 
region (6,7,9,10). The nature of the 4th chromosome 
presents a challenging situation for the cell, in which the 
transposons must be kept in a silent state at the same time 
as the embedded genes must remain active. One factor that 
has an important role in maintaining gene expression 
within this repressive environment is the chromosome 
4-specific protein POF (Painting of Fourth), which specif- 
ically binds to nascent RNA from actively transcribed 
genes and increases expression output (22-28). ChIP-
chip experiments have shown that HP1a and POF are
interdependently associated with the gene body of these 
active genes on the 4th chromosome (3,10,29-32), and 
together they exert opposite effects on gene expression, 
creating a tightly balanced mechanism for gene regulation 
(7,23). 

HP1a binding has also been found to be dispersed at a
number of euchromatic sites (33), where it binds to the 
gene body of active genes (31,34). The cytological region 
2L:31 is the most distinct of these euchromatic sites 
(10,30,32,33,35,36). H3K9 methylation in this region 
appears to be mediated mainly by SETDB1 (10). 

In line with HP1a's importance in heterochromatin
formation and gene repression, an RNAi-mediated
knock-down of HP1a has been shown to be associated
with increased expression of genes located on the
4th chromosome (23,37). Interestingly, several expression
studies have reported contradictory results, indicating that
HP1a has an activating function on gene expression;
different euchromatic genes have been shown to be
down-regulated in HP1a mutants (9,34,35,38) and in
RNAi knock-down experiments (39,40). To be properly
expressed, genes in the pericentromeric regions have
been shown to depend on HP1a and the heterochromatic
background, as exemplified by the genes light, rolled,
RpL15 and Dbp80 (41-46). Furthermore, detailed
mapping studies of HP1a binding sites in euchromatin
have shown its enrichment at developmentally regulated 
genes and at heat shock-induced chromosomal puffs, 
which are regions with high gene activity (34). 

To clarify the role of H3K9me2, me3, HP1a and POF in
gene transcription, we conducted genome-wide expression
studies and combined the results with binding data to
investigate the targeting and expression effects of HP1a
in the three different HP1a binding regions, i.e. chromo-
some 4, pericentromeric regions and cytological section
2L:31. We show here that HP1a has a repressing
function on chromosome 4, where it preferentially
targets non-ubiquitously expressed genes (NUEGs), and
an activating function in the pericentromeric regions,
whereas, on average, region 2L:31 is unaffected. The
effects of SETDB1 and Su(var)3-9 are similar to those of
HP1a, and on chromosome 4, Su(var)3-9, SETDB1 and HP1a
essentially target the same genes. Furthermore, we found
that HP1a binding and function correlate with gene
length, with longer genes being more repressed. Within
the pericentromeric regions, we observed that genes that
are closer to the proximal end of the chromosome are
more strongly bound and stimulated by HP1a.

MATERIALS AND METHODS 

Fly strains and genetic crosses 

Flies were cultivated and crossed at 25°C in vials contain-
ing potato mash-yeast-agar. Strain y w; Pof^D119/CyO was
previously generated in our laboratory (23). To generate
Su(var)3-9 nulls, we constructed trans-heterozygotes for
the two alleles Su(var)3-9^evo (10,23,47) and Su(var)3-9^06
(11,48). The Su(var)2-5^04, Su(var)2-5^05 (HP1a),
Su(var)3-9^evo and Su(var)3-9^06 strains were obtained
from Victor Corces (Johns Hopkins University,
Baltimore). Setdb1^10.1a and the hemagglutinin-tagged
SETDB1 encoding strain (Setdb1^3HA) used for polytene
staining were obtained from Carole Seum (University of
Geneva) (6). Oregon R was used as the wild-type strain,
and three replicates of 200 first-instar larvae from each
mutant (six replicates of wild-type) were collected from
yeast-prepared apple-agar plates ~48 h after egg laying
and then frozen at -80°C.

Immunostaining of polytene chromosomes 

Immunostaining of polytene chromosomes was essentially 
as described previously (22). We used primary antibodies 
against POF [chicken, 1:100 dilution (24) or rabbit, 1:400 
(25)], HP1a (PRB291C, 1:400, Covance) and HA (MMS
101R, 1:100, Covance, for detection of SETDB1.3HA).
Goat anti-rabbit, anti-chicken and anti-mouse conjugated
with AlexaFluor555 or AlexaFluor488 (1:300, Molecular
Probes) were used as secondary antibodies. 

Microarray analysis 

For microarray analysis, total RNA from Drosophila first- 
instar mutant larvae and wild-type control larvae was 
isolated using TRIzol reagent (Invitrogen), followed by 
purification using RNeasy Mini Kit (Qiagen) according 
to the manufacturers' protocols. The labeled cDNA 
probes were then hybridized to an Affymetrix Drosophila 
gene chip (version 2), and the intensity values were 
normalized and summarized using a robust multi-array 
analysis in R (www.R-project.org) and the Bioconductor 
package (49). The resulting data are available at
http://www.ncbi.nlm.nih.gov/geo/ (accession number:
GSE43478).

Calculations and treatment of microarray data sets 

We used data for the six wild-type replicates in combin-
ation with those for the three re-analyzed wild-type repli-
cates (23) to exclude genes that were unstably detected
(standard deviation of the nine wild-type replicates >1 on a
log2 scale). Next, we removed all un-expressed genes, i.e.
genes where the median expression of the replicates of all
mutant data sets and the median expression of the wild-type
replicates were lower than 6 on a log2 scale. For partially
un-expressed genes (i.e. genes where one or more replicate
expression values of any of the mutant and/or wild-type
conditions were <6), all replicate values <6 were set to 6.
Next, the data were scaled by adding an array-specific
constant to all the mutant array expression values so
that the total genomic expression in all mutant data sets
matched that of the wild-type, as previously described
(50). The relative expression ratio for each gene was
calculated as the median of the three mutant replicates
minus the median of the wild-type replicates. The
re-analyzed Pof mutant and wild-type data were treated
in the same way, but (apart from the first step of removing
unstably detected genes) these data were analyzed separ-
ately from the other mutant and wild-type data.
Ubiquitously expressed genes (UEGs) and NUEGs were
defined as described previously (28).

Gene binding data and average gene profiles 

Calculated average binding values of the exons of all ex-
pressed genes and calculated average meta-gene binding
profiles of all expressed genes were based on ChIP-chip
data for POF and HP1a (10) and Su(var)3-9 data from
modENCODE (third-instar larvae nuclei) (51). Gene
binding values for all genes and HP1a average profiles
were calculated as described previously [(29) and (10),
respectively]. HP1a binding values that correlated with
gene length and genomic position were calculated based
on modENCODE HP1a binding data (51) for genes
expressed to levels >6 in salivary gland tissue of the 
FlyAtlas database (52). Gene expression values were 
calculated from RNA-seq analysis results of third-instar 
larval salivary gland tissue (53). 

Data handling and statistical analysis 

All statistical analyses were performed on log2-scaled data
using Statsoft Statistica 10.0 or Excel 2010. The
Wilcoxon matched pairs test was used to test for
significant differences between the average expression
level (median of three replicates) of all expressed genes
in the mutants and the average expression
level of the wild-type (i.e. not between the mutant to
wild-type expression ratio and 0).
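
For readers unfamiliar with the test, the following is a minimal sketch of a two-sided Wilcoxon matched pairs (signed-rank) test using the large-sample normal approximation. It is an illustration only: the analyses in this study were run in Statistica or Excel, whose handling of ties, zeros and continuity correction may differ, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilcoxon_matched_pairs(x, y):
    """Return (W, p) for paired samples x and y.

    W is the smaller of the positive and negative rank sums. Zero
    differences are dropped and tied absolute differences receive
    average ranks. The p-value uses the normal approximation without
    tie or continuity correction, so it is only suitable for large n.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean_w = n * (n + 1) / 4
    sd_w = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean_w) / sd_w  # z <= 0 because w is the smaller rank sum
    p = min(1.0, 2 * NormalDist().cdf(z))
    return w, p
```

Here `x` would hold the per-gene mutant medians and `y` the matching per-gene wild-type medians for one genomic region.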

RESULTS 

Genome-wide binding of HP1a, POF and SETDB1

To study the binding patterns of SETDB1, Su(var)3-9,
HP1a and POF, we performed immunostaining of
D. melanogaster polytene chromosomes to investigate a
number of different genomic regions with HP1a-enriched
'GREEN chromatin' properties, i.e. chromosome 4,
pericentromeric regions (regions proximal to the centro-
mere on chromosomes 2L, 2R and 3L) and cytological
region 2L:31 (10,30,32,33,35,36). As expected, POF,
which has been shown to bind exclusively to chromosome
4 (23,24), shows some overlap with HP1a binding on the
4th chromosome but not in the pericentromeric regions,
where only HP1a binds (Figure 1A-C). Interestingly,
despite the high specificity for chromosome 4 genes, we
occasionally observed POF binding in parts of the 2L:31
region (Figure 1B). In addition to the co-localization of
HP1a and POF (Figure 1C), we also observed a clear
co-localization between POF and SETDB1 on chromo-
some 4 (Figure 1D). In contrast to HP1a,
neither POF nor SETDB1 displayed any binding to the
pericentromeric regions or the most distal part of the 4th
chromosome (Figure 1C and D). Thus, POF, HP1a and
SETDB1 appear to bind to the same locations on chromo-
some 4. HP1a and SETDB1 have previously been shown
to co-localize on the 4th chromosome (6,7,9,10,54). To
compare the relative binding levels of the proteins
between these regions, we re-analyzed binding data from
ChIP-chip experiments for POF (10), HP1a (10) and
Su(var)3-9 (51) and calculated the average binding levels
on exons of all actively transcribed genes. As expected,
POF displayed clear binding only to chromosome 4
(Figure 1E), whereas HP1a was associated with both the
4th chromosome and pericentromeric regions, and more
weakly bound to the 2L:31 region (Figure 1F).
Interestingly, Su(var)3-9 showed stronger binding
to chromosome 4 than to the other regions
(Figure 1G). Su(var)3-9 has previously been shown to
bind to the 4th chromosome to some extent, but in
contrast to SETDB1, has a minor effect on methylation
patterns at this site (10-12,55). We conclude that HP1a,
POF and SETDB1 binding overlap on the 4th chromo-
some, and in addition, HP1a and Su(var)3-9 bind to
the pericentromeric regions and distal end of the 4th
chromosome.

HP1 inhibits gene expression on the 4th chromosome and 
induces gene expression in pericentromeric regions 

To study the effects on regulation of gene transcription,
we prepared total RNA from first-instar larvae of trans-
heterozygous HP1a^04/HP1a^05, trans-heterozygous
Su(var)3-9^evo/Su(var)3-9^06 and homozygous Setdb1^10.1a/
Setdb1^10.1a mutants and trans-heterozygous HP1a^04
Pof^D119/HP1a^05 Pof^D119 double-mutants. The HP1a Pof
double-mutant was included to study the effects on
chromosome 4 expression in a case where both compo-
nents (HP1a and POF) of the proposed chromosome 4
balancing system were lost (29). Total RNA was
prepared from three replicates of first-instar larvae of 
each mutant and from six replicates of wild-type. The 
RNA was converted to cDNA and hybridized to an 
Affymetrix expression array. 

The wild-type replicates were used to eliminate genes 
that were unstably expressed under wild-type conditions 
due to biological and/or technical reasons. All genes with 
a wild-type standard deviation >1 (log2 scale) (in total 531
genes) and all genes with non- or sub-detectable expres- 
sion, i.e. with a median expression level <6 in all mutant 
data sets and the wild-type data set [as previously 
described (28,50)], were excluded from further analysis. 
We next calculated the expression ratios between the 
median values of three mutant replicates against the 
median values of the six wild-type replicates for each 
gene in the different mutant data sets. In addition to the 

Figure 1. HP1a, SETDB1 and POF binding overlaps on chromosome 4; occasionally, POF overlaps with HP1a binding in region 2L:31. (A) HP1a
(green) and POF (red) localization on a whole wild-type polytene chromosome. (B) Close-up image of pericentromeric, chromosome 4 and 2L:31
regions. The arrow indicates chromosome 4, the arrowhead indicates the pericentromeric region and the asterisk indicates cytological region 2L:31. (C) POF and
HP1a binding on chromosome 4. (D) POF and HA (for detection of HA-tagged SETDB1) staining in red and green, respectively, on chromosome 4
in a SETDB1-3HA third-instar larva. The arrow indicates chromosome 4 and the arrowhead indicates the pericentromeric region. DNA is stained with
DAPI (blue). (E-G) Mean exon binding value (log2 scale) of POF (E), HP1a (F) and Su(var)3-9 (G) for all active genes within chromosome 4,
pericentromeric regions, 2L:31 region and control region (whole chromosome 3R) (n = 50, 68, 56 and 1753, respectively). Dashed lines represent
binding levels in the control region (chromosome 3R) and error bars indicate the 95% confidence interval.

newly generated data, we re-analyzed expression data
from homozygous Pof^D119/Pof^D119 mutant first-instar
larvae and from the corresponding wild-type first-instar
larvae (three replicates each) (23). Notably, in contrast
to the HP1a Pof mutant (and the other mutants
analyzed in this study), the Pof mutant had no maternal
contribution of the POF protein. The Pof mutant and
wild-type data were treated in the same way as described
above, and expression ratios were calculated for all
genes.

To analyze the role of the different proteins in gene
expression regulation within chromosome 4, region
2L:31 and the pericentromeric regions, we calculated the
average expression ratios for the mutants versus wild-type
within each region. In line with previous reports
(22,23,28), removing POF resulted in a significant reduc-
tion of chromosome 4 gene expression (-0.3 on the log2
scale or 81% of wild-type expression) (Figure 2A),
indicating that POF exerts a stimulating effect on the
4th chromosome. The HP1a mutant displayed an
increased expression level of chromosome 4 genes (0.16
on the log2 scale or 112% of wild-type expression),
which is in line with previous results (23). In the case of
the HP1a Pof double-mutant, where both components
of the chromosome 4 expression balancing system were
absent, no significant change was observed. The Setdb1

Figure 2. HP1 inhibits gene expression on the 4th chromosome and induces gene expression in pericentromeric regions. The mean expression ratio
(log2 scale) detected by Affymetrix expression arrays in Pof, HP1a, HP1a Pof, Setdb1 and Su(var)3-9 mutants versus wild-type in (A) chromosome 4,
(B) pericentromeric regions, (C) cytological region 2L:31 and (D) control (all active genes, except within the three tested regions). Squares indicate the
mean value and whiskers indicate the 95% confidence interval. Dashed lines indicate no change in expression ratio. The Wilcoxon matched pairs test was
used to estimate the significance of the difference between the average absolute mutant expression level and the average absolute wild-type expression level.
** indicates significance at P<0.01 and * indicates significance at P<0.05. (Sample sizes for the Pof mutant were n(chr4) = 69, n(pericentromeric) = 79,

and Su(var)3-9 mutants both displayed increased
chromosome 4 expression levels (0.07 and 0.1 on the
log2 scale, respectively), but the effects were less
pronounced than in the HP1a mutant (Figure 2A).
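
The percentages of wild-type expression quoted for chromosome 4 follow directly from the log2 ratios, because a log2 ratio r corresponds to a fold change of 2^r. A small sketch of the conversion (the function name is illustrative):

```python
def log2_ratio_to_percent(log2_ratio):
    """Convert a mutant/wild-type log2 ratio to percent of wild-type expression."""
    return 100.0 * 2.0 ** log2_ratio


for ratio in (-0.3, 0.16, 0.07, 0.1):
    print(f"{ratio:+.2f} -> {log2_ratio_to_percent(ratio):.0f}% of wild-type")
```

With the values reported here, -0.3 corresponds to about 81% and 0.16 to about 112% of wild-type expression, while 0.07 and 0.1 correspond to roughly 105% and 107%.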

Interestingly, in the pericentromeric regions, HP1a dis-
played an opposite trend to that on the 4th chromosome;
the expression level was reduced in the HP1a mutant,
indicating a stimulatory function of HP1a on gene expres-
sion. A lack of Su(var)3-9, the protein mediating H3K9me
in this region, also caused reduced gene expression, but the
effect was less pronounced than in the HP1a
mutant. As expected, the Setdb1 mutant did not show
any effect on gene expression in the pericentromeric
regions (Figure 2B). In region 2L:31, the Su(var)3-9
mutant displayed a reduction in the average expression
ratio of ~0.15 on the log2 scale (Figure 2C).
Surprisingly, the Pof mutant had a weak, but significant, 
up-regulating effect in both the pericentromeric regions 
(Figure 2B) and the 2L:31 region (Figure 2C). 

We conclude that the effect of HP1a on gene expression
depends on the genomic region; on average, HP1a had
a repressing effect on chromosome 4 active genes, a 
stimulating effect on pericentromeric active genes and a 
slightly stimulating effect on active genes in region 2L:31. 
In general, the 2L:31 region responded in an opposite 
manner to the 4th chromosome. Furthermore, we 
observed that SETDB1 and Su(var)3-9 always affected 
gene expression in the same direction. 

SETDB1 and Su(var)3-9 display extensive overlapping 
functions at both genome-wide and chromosome 4 levels 

We were surprised by the comparable directional effects of 
SETDB1 and Su(var)3-9 on gene regulation, as they are 
known to control H3K9 methylation mainly in different 
regions. Therefore, we wanted to identify individual genes 
that are co-regulated by HP1a, SETDB1 and/or Su(var)3-9
(i.e. genes that are differentially affected in expression
level in more than one mutant) on both a genome-wide
and regional level. Genes were defined as differentially
expressed if all the replicates of a particular mutant ex-
hibited expression values higher or lower than all the wild-
type replicates and if the median expression ratio between
the mutant and wild-type was higher than 0.2 or lower than -0.2
on the log2 scale, respectively. To reduce the possibility
of false-positives caused by normal variation in the wild- 
type samples, the six wild-type replicates were first divided 
into two sets of three replicates so that different wild-type 
sets could be used when comparing the two mutant 
conditions. 
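
The two-part criterion for differential expression described above (no overlap between mutant and wild-type replicates, plus a median log2 ratio beyond ±0.2) can be written compactly. This is a minimal sketch with a hypothetical function name, not the code used in the study:

```python
from statistics import median


def classify_gene(mutant_values, wild_type_values, threshold=0.2):
    """Return 'up', 'down' or 'unchanged' for one gene (all values on a log2 scale).

    A gene is 'up' if every mutant replicate exceeds every wild-type
    replicate and the median ratio is above +threshold; 'down' is the
    mirror image. Everything else is 'unchanged'.
    """
    ratio = median(mutant_values) - median(wild_type_values)
    if min(mutant_values) > max(wild_type_values) and ratio > threshold:
        return "up"
    if max(mutant_values) < min(wild_type_values) and ratio < -threshold:
        return "down"
    return "unchanged"
```

Requiring non-overlapping replicates in addition to a ratio cut-off guards against calling a gene differentially expressed on the strength of a single outlying replicate.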

The numbers of differentially expressed genes that 
overlapped in the different mutants are shown as Venn 
diagrams in Figure 3A and B. In line with the results in 
Figure 2, the overlap between HP1a, Setdb1 and
Su(var)3-9 on a whole-genome level was substantial for
both down-regulated (i.e. stimulated by encoded proteins
in wild-type) and up-regulated (i.e. inhibited by encoded
proteins in wild-type) genes (Figure 3A), indicating that
SETDB1 and Su(var)3-9 mainly affect the same genes,
although the Su(var)3-9 mutant had more down-regulated
genes than the Setdb1 mutant. It is noteworthy that on a
whole-genome level, the overlap was greatest for
down-regulated genes, whereas on chromosome 4
(where HP1a, SETDB1 and Su(var)3-9 are known to
have a repressing effect on gene expression), the overlap
was greatest for up-regulated genes. In addition, almost
all the up-regulated genes on chromosome 4 showed a
strong overlap between the Su(var)3-9, Setdb1 and
HP1a mutants, indicating they all target and repress the
same set of genes (Figure 3B).
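
The Venn diagram comparison amounts to counting set intersections among the lists of differentially expressed genes. A minimal sketch follows; the function name and the gene identifiers in the usage example are hypothetical placeholders, not data from this study.

```python
def overlap_counts(hp1a, setdb1, su_var_3_9):
    """Count genes in each of the seven regions of a three-set Venn diagram."""
    return {
        "HP1a only": len(hp1a - setdb1 - su_var_3_9),
        "Setdb1 only": len(setdb1 - hp1a - su_var_3_9),
        "Su(var)3-9 only": len(su_var_3_9 - hp1a - setdb1),
        "HP1a and Setdb1": len((hp1a & setdb1) - su_var_3_9),
        "HP1a and Su(var)3-9": len((hp1a & su_var_3_9) - setdb1),
        "Setdb1 and Su(var)3-9": len((setdb1 & su_var_3_9) - hp1a),
        "all three": len(hp1a & setdb1 & su_var_3_9),
    }


# Placeholder gene identifiers, for illustration only.
up_in_hp1a = {"geneA", "geneB", "geneC"}
up_in_setdb1 = {"geneA", "geneB"}
up_in_su_var_3_9 = {"geneA", "geneC", "geneD"}
print(overlap_counts(up_in_hp1a, up_in_setdb1, up_in_su_var_3_9))
```

The same function would be applied separately to the up-regulated and down-regulated gene lists, and to the whole genome or to chromosome 4 genes only.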

HP1a and Su(var)3-9 repress transposon-derived
transcripts 

HP1a and Su(var)3-9 have been suggested to inhibit
expression of transposons (56). Therefore, we investigated
the expression of transposons in HP1a, Su(var)3-9 and
Setdb1 mutants. The majority of the differentially
expressed transposons (transposons and retrotransposons
as defined by Affymetrix Drosophila gene chip, version 2)
were up-regulated in the HP1a and Su(var)3-9 mutants,
whereas the Setdb1 mutant affected fewer transposons
(Figure 3C). To obtain a more general view of how
transposons are regulated, we determined the average
expression ratio for all expressed transposons within the
whole genome. We found that HP1a (Figure 3D), HP1a
Pof (Figure 3E) and Su(var)3-9 (Figure 3F) mu-
tants all exhibited significantly increased transposon
expression levels (0.6-0.7 on the log2 scale), whereas
Setdb1 and Pof mutants had no effect on transposon
expression (Figure 3G and H). We conclude that
HP1a and Su(var)3-9 repress transposon-derived RNA
expression.

HP1a preferentially represses NUEGs on chromosome 4
and stimulates expression of UEGs in pericentromeric 
regions 

To test if the opposite gene expression effects could be 
connected with different gene types, we examined UEGs 
and NUEGs, as defined in a previous study in which POF 
was found to preferentially target NUEGs on chromo- 
some 4 (28). The average expression ratio for each 
mutant was calculated for UEGs and NUEGs on chromo- 
some 4, in the pericentromeric regions and in cytological 
region 2L:31 (Figure 4A-C). As observed previously (28), 
we found that POF on chromosome 4 mainly targets 
and stimulates expression of NUEGs (Figure 4A). 
Furthermore, we observed that the repressing effect of
HP1a on chromosome 4 was also stronger for the
NUEGs, as expected, because POF and HP1a are known to
bind to essentially the same genes on chromosome 4.
However, the difference in expression ratio between the 
NUEGs and UEGs was not significant. 

A similar trend was observed for the HP1a Pof, Setdb1
and Su(var)3-9 mutants (Figure 4A). Notably, although
the effects were small for the pericentromeric regions
(Figure 4B), an opposite trend was observed, i.e. UEGs
were more affected than the NUEGs in all mutants (even
Pof, which affected expression in the opposite direction to
the other mutants) (Figure 4B). The same trend was not
observed in the 2L:31 region (Figure 4C).



Nucleic Acids Research, 2013, Vol. 41, No. 8 4487 



[Figure 3, panels A-H. Venn diagrams compare the HP1a, Setdb1 and Su(var)3-9 mutants for the whole genome (A), Chr4 (B) and transposons (C); panels D-H plot mean expression ratios on an axis running from 0.0 to 1.2.]

Figure 3. SETDB1 and Su(var)3-9 display extensive overlapping functions at both genome-wide and chromosome-4 levels. Differentially up- or
down-regulated genes from each mutant are compared in a Venn diagram to identify genes that are co-regulated by HP1a, SETDB1 and Su(var)3-9.
Genes were defined as differentially up- or down-regulated if none of the three mutant replicates overlapped with any of the three wild-type replicates
and if the ratio between mutant and wild-type median values was greater than 0.2 (log2 scale). (A-C) Differentially down-regulated genes (left
diagram) or up-regulated genes (right diagram) in the whole genome (A), on chromosome 4 (B) and transposons (C). (D-H) Mean expression ratio
(log2 scale) for transposons (n = 32 for Pof mutant and n = 40 for the other mutants) compared with the rest of the genome (n = 9318 for Pof
mutant and n = 9547 for the other mutants) in HP1a (D), HP1a Pof (E), Su(var)3-9 (F), Setdb1 (G) and Pof (H) mutants compared with the median
of six wild-type replicates (three wild-type replicates for the Pof mutant). Whiskers indicate the 95% confidence interval.



We therefore conclude that POF and HP1a preferen-
tially target NUEGs on chromosome 4, where POF stimu-
lates and HP1a inhibits gene expression, whereas in the
pericentromeric regions, HP1a stimulates gene expression
with a slight preference for UEGs. 

Next, we investigated whether the opposite gene expres-
sion effects of HP1a on NUEGs and UEGs could be con-
nected with HP1a binding preferences. Therefore, we
re-analyzed HP1a binding data from (10) and made the
interesting observation that HP1a on chromosome 4 binds
more strongly to the promoter than to the gene body in UEGs
compared with NUEGs. In the pericentromeric regions,
we measured stronger promoter binding than gene body 
binding for both NUEGs and UEGs, although the differ- 
ence was more pronounced for UEGs. 



HP1a inhibits gene expression in long genes and
stimulates expression in short genes 

We have previously shown that the degree of buffering of
gene expression in segmental monosomies correlates with
gene length (50) and hypothesized that the repressing
role of HP1a is connected with HP1a binding in the
gene body (10). Therefore, we surmised that gene
length might correlate with the repressing effect of HP1a.
Hence, we plotted gene length against the expression ratio
(HP1a mutant versus wild-type) and found a clear positive correl-
ation; the longer the gene, the more up-regulated it was
in the HP1a mutant, whereas the shortest genes (~2 kb
and shorter) were down-regulated in the HP1a mutant
(Figure 5A). This correlation was not observed for the

Figure 4. HP1a preferentially inhibits NUEGs on chromosome 4 and stimulates expression of UEGs in pericentromeric regions. The mean expres-
Squares indicate the mean value and whiskers indicate the 95% confidence interval. Dashed lines indicate no change in expression ratio.

other mutants (data not shown). However, HP1a binding
levels (average binding value of exons of expressed genes)
from an HP1a ChIP experiment on salivary glands did
not correlate with gene length, implying that the HP1a
binding density per length unit was the same for short
and long genes (Figure 5B). Furthermore, the correlation
seen between HP1a expression ratio and gene length was
not associated with any correlation between gene length
and wild-type expression levels (Figure 5C). The deviating
behavior of the shortest gene length bin (<0.5 kb) may be
because a large proportion of this group consists of probe
sets that are undefined by Affymetrix (Supplementary
Figure S1A), and are thus likely to be non-coding tran-
scripts or repetitive elements. The low wild-type expres-
sion of this bin (Figure 5C) also indicates that they are
situated within repressed chromatin. When looking at all
gene lengths, the undefined probe sets were significantly
up-regulated in the HP1a mutant (Supplementary Figure
S1B), indicating that HP1a is likely to be involved in re-
pressing these repetitive and/or non-coding elements.
Notably, the fact that region 2L:31, despite having
HP1a binding (Figure 1), did not show the same effects
as observed for chromosome 4 and the pericentromeric
regions could potentially be explained by this gene
length effect, as the average gene length in region 2L:31
was significantly shorter than in the other regions
(Figure 5D). However, as shown in Figure 1, the
binding levels of HP1a were lower in region 2L:31 genes
compared with the 4th chromosome and pericentromeric
region genes.

We conclude that the effects on the expression ratio
in HP1a mutants are dependent on gene length, i.e. long
genes are repressed by HP1a, whereas short genes appear
to be stimulated, and HP1a binds with the same density
per length unit irrespective of the size of the gene. As
expected, the shortest gene length bin, over-represented
by transcripts from repetitive elements and non-coding
RNAs, was repressed by HP1a.
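
The gene length analysis above groups genes into length bins and averages the expression ratio within each bin. A minimal sketch of that binning is shown below; the function name is hypothetical, and the bin edges in the usage example are illustrative assumptions, as the edges used for Figure 5 are not listed in the text.

```python
from bisect import bisect_right
from statistics import mean


def mean_ratio_by_length_bin(genes, bin_edges_kb):
    """Average the log2 expression ratio within gene length bins.

    genes: iterable of (length_kb, log2_ratio) pairs.
    bin_edges_kb: ascending inner bin edges; a gene whose length equals
    an edge falls into the bin above it. Returns one mean per bin, or
    None for an empty bin.
    """
    bins = [[] for _ in range(len(bin_edges_kb) + 1)]
    for length_kb, ratio in genes:
        bins[bisect_right(bin_edges_kb, length_kb)].append(ratio)
    return [mean(values) if values else None for values in bins]


# Illustrative values only: (gene length in kb, mutant vs wild-type log2 ratio).
example_genes = [(0.3, 0.4), (1.0, -0.1), (1.5, -0.3), (10.0, 0.2)]
print(mean_ratio_by_length_bin(example_genes, [0.5, 2.0]))
```

Replacing the expression ratio with the binding value or the wild-type expression level in the input pairs gives the corresponding summaries for the other panels.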

Distance of genes from the centromere within 
pericentromeric regions affects the level of 
HP1a regulation

We found that different gene types (UEGs and NUEGs) 
and different gene lengths affected the level of HP1a
binding and regulation. Next, we investigated whether 

Figure 5. HP1a inhibits gene expression of long genes and stimulates expression of short genes. (A) Mean plot of expression ratio in an HP1a
mutant versus wild-type plotted against bins of gene length (defined as length between transcriptional start and stop sites in kb; sample numbers
from lowest to highest bins are 473, 1255, 1321, 1189, 897, 662, 930, 592, 425, 565, 348 and 1405, respectively). (B) Mean plot of average HP1a binding
values of all genes plotted against bins of gene length. (C) Mean plot of wild-type expression value plotted against bins of gene length. (D) Average
gene length (of the active genes that are included in this study) in the different HP1a binding regions: chromosome 4 (n = 78), pericentromeric
regions (n = 88), cytological region 2L:31 (n = 72) and in the remaining euchromatic genome (n = 9824). All y-axis values are on a log2 scale and
whiskers indicate the 95% confidence interval.



the position of genes in relation to centromeric hetero- 
chromatin was connected with HPla function. To study 
this, we plotted the expression ratio in HPla mutants 
against the position of genes within the regions of 
interest and found that in the pericentromeric regions of 
chromosome 2L and chromosome 3L, the effect of HPla 
on gene expression correlated positively with the distance 
between the transcriptional start site of the gene and the 
heterochromatic centromere region (proximal end of 
chromosome arm according to sequence release 5.43) 
(Figure 6A). This correlation was not observed for 
pericentromeric genes on chromosome 2R (result not 
shown), and chromosome 3R was excluded because it 
has no defined HP la-bound pericentromeric region (30). 



To test whether the binding of HP1a also correlates with
distance from the centromere, HP1a binding levels of
individual active genes were plotted against the position of
the transcriptional start site within the 2L and 3L
pericentromeric regions. The results showed a significant
negative correlation between binding and distance from the
centromere (Figure 6B).
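
A relationship of this kind can be quantified with a
correlation coefficient. The sketch below computes a Pearson
coefficient from paired values of distance and binding. It is
an illustration only: the choice of Pearson correlation, the
function name and the toy numbers are assumptions, not a
description of the statistics used for Figure 6.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Toy data (made up): distance of the transcriptional start site from the
# proximal end of the chromosome arm (kb) and an HP1a binding value per gene.
distance_kb = [50, 200, 400, 800, 1200]
hp1a_binding = [2.1, 1.8, 1.5, 0.9, 0.6]
r = pearson_r(distance_kb, hp1a_binding)
print(f"r = {r:.3f}")  # a negative r means binding falls with distance
```

The same function applied to distance and the expression ratio
in the mutant would describe the relationship in Figure 6A.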

We conclude that for genes within the 2L and 3L
pericentromeric regions, HP1a binding increases in strength
with increasing proximity of the genes to the
heterochromatic centromere. Further, because HP1a has an
activating function in the pericentromeric regions, these
proximal genes are likely to be more transcriptionally
stimulated than the distal ones.



DISCUSSION 

Chromosome 4 is bound by HP1a, POF, SETDB1 and
Su(var)3-9

Heterochromatin protein 1 is a protein that has been
well studied in many model organisms, including
Schizosaccharomyces pombe, mouse and D. melanogaster.
Although D. melanogaster HP1a is best known for its role
in heterochromatin formation and silencing, several
reports have linked HP1a to regulation of transcriptional
activity of heterochromatic and some euchromatic genes
(5,57). We asked if these conflicting results could partly be
explained by a region-specific function of HP1a and the
proteins involved in HP1a binding, i.e. SETDB1,
Su(var)3-9 and POF (9,10,23). Based on polytene
chromosome staining, it was clear that POF, HP1a and SETDB1
overlapped on chromosome 4 but not on the
pericentromeric section or on the most distal part of the tip,
which were bound only by HP1a. These POF- and
SETDB1-unbound regions also correspond to regions that are
independent of SETDB1 for maintaining a proper H3K9me2
and H3K9me3 pattern (10). In line with previous studies (3,51),
we found that Su(var)3-9 binds to chromosome 4 when
considering expressed genes, and more interestingly,
this binding to active genes is, on average, stronger than
the binding of Su(var)3-9 to active genes in the
pericentromeric regions, although loss of Su(var)3-9 had
minor effects on the methylation pattern of chromosome
4. The putative function of Su(var)3-9 on chromosome 4
therefore remains elusive.

In addition to the persistent binding of POF to
chromosome 4, it is interesting to note the presence of occasional
binding to region 2L:31. It is known that POF binds to
HP1a binding sites where HP1a binding is dependent on
SETDB1, and as we have previously shown that binding
of HP1a in region 2L:31 is dependent on SETDB1 (10),
this could partially explain the sporadic binding of POF in
this region. Region 2L:31 displayed similar properties to
other euchromatic regions that are unbound by SETDB1
and HP1a (35). Thus, the reason for the targeting of this
particular region remains to be explained.

HP1a has opposite functions on chromosome 4 and in
pericentromeric regions 

HP1a has long been known for its repressive function. It
was initially identified as a dominant suppressor of
position-effect variegation and was named Su(var)205
(2,58-61), and we have previously reported that HP1a
represses gene expression on chromosome 4 (23).
However, several studies have reported an activating
function of HP1a (9,34,35,38-40). Our current study
suggests that these conflicting reports can, at least
partly, be explained by our observation that HP1a has
different functions in different regions: chromosome 4 genes
are, on average, repressed, whereas pericentromeric
genes are stimulated. We therefore believe that it is
important to look at different groups of genes when studying
the effects of HP1a. Otherwise, these opposing effects may
cancel each other out on a genome-wide level.

Nevertheless, the conflicting results cannot be fully
explained by our findings. For example, Schwaiger et al. (40)
also studied chromosome 4 effects and found that
transcription was reduced in an RNAi-mediated HP1a
knockdown, in contrast to our results. Therefore, technical
differences between experiments should also be considered;
in our study, we cannot exclude the possibility of a
maternal contribution of HP1a, as we studied mutants
in first-instar larvae from heterozygous mothers, and it
is thus likely that we have a reduction in HP1a levels
rather than complete removal. It has been shown that
maternal HP1a contributes to ~20% of the HP1a
protein found in heterozygous mutant first-instar larvae
(44). Like others, we have previously shown that the
average level of gene expression of chromosome 4 is
comparable with, or even higher than, that of genes on other
chromosomes (10,62). At least to some extent, this is a
consequence of POF-mediated stimulation of gene
expression output, which counteracts the repressing
nature of the 4th chromosome (22,23,28). We hypothesize
that due to POF and other factors, genes on the 4th
chromosome have evolved to be functional in this
repressive GREEN-chromatin environment. A decrease in HP1a
is mainly expected to cause a reduction of the low-affinity
binding of HP1a in the gene body, and consequently a
de-repression of gene expression. However, prolonged
loss or very strong depletion of HP1a will most
probably have dramatic effects on the overall structure
of chromosome 4 chromatin, and thus lead to a
dysfunctional chromatin structure with decreased gene expression.
This implies that the genes have adapted to be properly
expressed in the local chromatin environment.

As we have previously shown, POF is involved in
stimulating expression of active genes on chromosome 4.
The observed effects in the Pof mutant on genes in the
pericentromeric regions and region 2L:31 are most likely
explained by indirect effects when HP1a is being
redistributed from chromosome 4 to other binding sites,
as binding of HP1a to chromosome 4 is dependent on the
presence of POF (9,10,23). The increased transcriptional
output of chromosome 4 genes in the Setdb1 mutant is
likely due to loss of the repressive methylation marks,
which in turn will reduce HP1a binding (6,7,9,10).
As HP1a binding to promoters is known to be
independent of methylation marks, it is possible that
HP1a remains bound at promoters, where it exerts an
activating function.

The increased chromosome 4 expression observed in the
Su(var)3-9 mutant is surprising but could be explained by
indirect effects; we speculate that when Su(var)3-9 is lost
from the pericentromeric regions, SETDB1 is redirected
from chromosome 4 to sustain normal H3K9 methylation
in the pericentromeric regions, thus decreasing HP1a
binding to chromosome 4. This could explain why both
the Setdb1 and the Su(var)3-9 mutants give such similar
up-regulating effects on chromosome 4 expression. An
alternative explanation for this effect is that the observed
binding of Su(var)3-9 to chromosome 4 has a yet-unknown
repressing function independent of the HKMT
function of Su(var)3-9.

The HP1a Pof double mutant displayed weak,
non-significant up-regulation of chromosome 4, with
marginally larger error bars than for the HP1a mutant, which
supports the suggested balancing mechanism of
chromosome 4, where HP1a and POF fine-tune the transcriptional
output; in the absence of both components, the overall
expression will not change but individual genes will start
losing proper transcriptional control.

Functions of SETDB1 and Su(var)3-9 are less 
complementary than previously suggested 

Although previous studies have indicated that SETDB1
and Su(var)3-9 have separate main targets, our data
show that the majority of genes that are up- or
down-regulated in Su(var)3-9 mutants are
correspondingly up- or down-regulated in Setdb1 mutants. These
results suggest that a redundancy exists between these
two proteins, in which both proteins, to some extent,
have the ability to be redirected to other locations when
needed, as we know that Su(var)3-9 has a chromosome 4
binding capacity. Alternatively, the HP1a system might
affect a number of genetic networks so that even if
different regions are affected by Su(var)3-9 and Setdb1, the
same genetic networks may be indirectly affected.
Because Su(var)3-9 affects larger regions than Setdb1, it
is likely that more HP1a will be released and redirected to
other regions in the Su(var)3-9 mutant than in a Setdb1
mutant, thus causing repression of genes normally
unbound by HP1a. This would explain why more genes
are down-regulated in the Su(var)3-9 mutant compared
with the Setdb1 mutant.
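
The comparison of the direction of change between two mutants
can be sketched as follows. This is a hypothetical
illustration: the function name, the gene identifiers, the
log2 ratio threshold and the toy values are all assumptions,
and the criteria actually used to call genes up- or
down-regulated are not restated here.

```python
def direction_concordance(ratios_a, ratios_b, threshold=0.0):
    """Compare the direction of expression change between two mutants.

    ratios_a, ratios_b: dicts mapping gene id -> log2 ratio (mutant vs
    wild type). Genes whose absolute log2 ratio in mutant A exceeds
    `threshold` are treated as changed (a hypothetical criterion).
    Returns (n_changed_in_a_with_data_in_b, n_same_direction, fraction).
    """
    changed = [
        gene for gene, ratio in ratios_a.items()
        if abs(ratio) > threshold and gene in ratios_b
    ]
    # Same direction means both ratios are positive or both are negative.
    same = sum(1 for gene in changed if ratios_a[gene] * ratios_b[gene] > 0)
    fraction = same / len(changed) if changed else float("nan")
    return len(changed), same, fraction


# Toy data (made up): log2 ratios in two mutants for four hypothetical genes.
suvar39 = {"geneA": 0.8, "geneB": -0.5, "geneC": 0.6, "geneD": 0.05}
setdb1 = {"geneA": 0.4, "geneB": -0.3, "geneC": -0.2}
n_changed, n_same, frac = direction_concordance(suvar39, setdb1, threshold=0.3)
print(f"{n_same} of {n_changed} changed genes move in the same direction")
```

A fraction well above one half would indicate that genes
changed in one mutant tend to change in the same direction in
the other.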

Transposon-derived transcripts are repressed by HP1a
and Su(var)3-9 

Our results provide strong support for the suggested model
in which transposons are repressed by HP1 proteins, as
shown for HP1a (63), the HP1 homolog Rhino (64,65)
and also Su(var)3-9 (56). In contrast, neither SETDB1
nor POF had any effect on transposon expression.
Because SETDB1 is known to have a role in repression 
of chromosome 4, one could speculate that SETDB1 
has a greater influence on repression of transposons 
located specifically on chromosome 4 than in other 
parts of the genome. However, due to the repetitive 
nature of the transposons and the methods used here, 
it was not possible to distinguish effects for transposons 
in specific regions. 

HP1a mainly affects different gene types on chromosome
4 compared with pericentromeric regions 

The observation that chromosome 4 displays a stronger
effect on NUEGs, both in terms of down-regulation in
the Pof mutant and up-regulation in the HP1a mutant,
is supported by previous findings on chromosome 4 (28).
One potential explanation for this is that NUEGs have
evolved to respond to a regulatory mechanism, whereas
UEGs are more robust in expression. Although weak, it
is noteworthy that the effect of the HP1a mutant in the
pericentromeric regions (decreased gene expression)
was slightly stronger for UEGs than NUEGs; this is in
line with the relatively strong binding peak found in
promoters compared with the gene body in pericentromeric
UEGs, as it has been proposed that HP1a in the promoter
has a stimulating effect and HP1a in the gene body of
chromosome 4 genes has an inhibiting effect (10,66).
Note that the number of NUEGs exceeds the number of
UEGs on a whole-genome level and on chromosome 4,
whereas in pericentromeric regions, the UEGs are
over-represented.

In summary, our data support a model in which HP1a
binding to promoters in general has a positive function for
transcriptional output, whereas HP1a binding in gene
bodies has a negative function. If binding in the gene
body is relatively large compared with binding to the
promoter, the negative function will dominate. In
contrast, if the binding to the promoter is stronger than
in the gene body, the stimulating effect will be larger, albeit
not dominating. A reduction in HP1a levels will initially
affect the low-affinity gene body binding and, subsequently,
the loss of HP1a will also affect the promoter binding.

HP1a effects depend on gene length

We showed that the average binding level of HP1a is
constant irrespective of gene length (the HP1a binding
per length unit is constant), implying that longer genes
have more HP1a molecules bound in total. This finding,
in combination with the suggestion that the repressive
effect of HP1a is mainly observed in the gene body,
could explain the greater de-repression of longer genes.
Stronger binding of HP1a to long genes has also been
suggested by de Wit et al. (67). In addition, some
chromatin marks mostly associated with active chromatin have
been shown to bind differently to different gene lengths
(51), suggesting that gene length affects the level of
association with chromatin marks. Furthermore, there are
indications that HP1 proteins are involved with the
transcription machinery; the mammalian HP1 isoform gamma
and H3K9me3 regulate transcriptional activation by
associating with RNA polymerase II (RNP2) (68),
and HP1 can interact with and guide the recruitment of
the histone chaperone complex FACT to active genes,
which facilitates RNP2 transcription elongation. This,
along with our findings, suggests a mechanism in which
HP1a is involved in transcriptional elongation. We
speculate that HP1a slows down the progression of RNP2
through the length of the gene body. HP1a binding
mechanisms could also be connected with RNA interactions, as
HP1a has been shown to directly interact with RNA
transcripts and heterogeneous nuclear ribonucleoproteins (38),
and HP1a association to centric regions in Drosophila
and mice is sensitive to RNase treatment (34,69).

Interestingly, we discovered that a group of
non-annotated short genes (<0.5 kb) were repressed by
HP1a, even though the HP1a binding levels appeared to
be low (which might be explained by technical aspects
in determining the binding levels). The lack of annotation
and lower wild-type expression level suggest that this
group consists of many short genes encoding ncRNAs.

Position of genes within pericentromeric regions affects 
level of HP1a binding and effect

In the pericentromeric regions of the genome, we observed
an interesting connection between the binding and
stimulating effects of HP1a and the position of the gene;
the closer the gene is located to the centromeric
chromatin, the more strongly HP1a binds and stimulates it.

In conclusion, we found that HP1a has opposite
functions in different genomic regions, repressing expression
on chromosome 4 and stimulating expression in
pericentromeric regions. Furthermore, the targets of
Su(var)3-9 and SETDB1 are considerably more redundant
than previously reported, and the overlap between HP1a,
Su(var)3-9 and SETDB1 on chromosome 4 genes is
extensive. It is, however, important to note that the different
effects caused by HP1a, SETDB1, Su(var)3-9 and POF
could all be interrelated to create a balanced genome.
Therefore, it is hard to distinguish the separate effects
caused by the different proteins.